Project: Computer Vision Capstone Project:

Distracted Driver Posture Classification


The Center for Disease Control and Prevention (CDC) found that nearly one in five vehicle accidents are caused due to distracted driving [1]. This statistics has led to more than 3,000 fatal injuries and 425,000 injuries every year in the USA. This project is taken from a Kaggle competition [2] and the purpose of this project is to classify the action of the drivers which they are doing in the provided images and whether they are distracted or not[3]. The major cause of these accidents was the use of mobile phones. The National Highway Traffic Safety Administration (NHTSA) defines distracted driving as “any activity that diverts attention from driving”, including: a) talking or texting on one’s phone, b) eating and drinking, c) talking to passengers, or d) fiddling with the stereo, entertainment, or navigation system [4]. The CDC provides a broader definition of distracted driving by taking into account visual (i.e. taking one’s eyes off the road), manual (i.e. taking one’s hands off the driving wheel) and cognitive (i.e. taking one’s mind off driving) causes [5]. State Farm created a computer vision competition on Kaggle, a platform that provides data science projects and company sponsored competitions. The company is challenging competitors to classify the driver’s behavior.


Data is a collection of 10 different states of drivers containing one safe driving and 9 other distracted modes. The dataset is provided by State Farm through Kaggle which can be downloaded from here. This project is sponsored by State Farm on Kaggle website. State Farm aims to reduce these alarming statistics, and better insure their customers, by detecting the driver’s distracting activity from the dashboard cameras. Given a dataset of 2D dashboard camera images, State Farm is challenging Kagglers to classify each driver's behavior [2]. Are they driving attentively, wearing their seatbelt, or taking a selfie with their friends in the backseat? The dataset consists of an “imgs” folder that has a test and train folder of 640 x 480 jpg files. The images were taken by a dashboard camera. Each image consists of a driver performing a task from one of the distracted tasks. There are no duplicate images in the dataset. Also State Farm removed metadata from each image (e.g. creation dates). State Farm set up these experiments in a controlled environment. While performing each task, the drivers were not driving as a truck dragged the car on the streets. Below I listed the number of files for each category in the training data and the test data.

In total, there are 22,424 training examples. “Safe driving” has the most examples, and “Hair and makeup” the least. It makes sense to have a lot of “safe driving“examples because State farm is in general interested in finding out if the driver is driving safely or not. They would rather have false negatives than false positives when labeling safe driving. More examples would improve the classifier’s performance in decreasing false positives for safe driving. “Hair and makeup” may be less because it is a task mostly performed by women.


The State Farm supplied the data for a Kaggle challenge. The dataset consists of 22,400 training and 79,727 testing images (640 × 480 with RGB colors) of drivers either driving attentively or doing one of 9 classes of distracting behaviors [3] [8].


The project utilizes the following dependencies:

  • Python 3.6: Tensorflow, Keras, Numpy
  • NVIDIA Geforce GTX1050 GPU, CUDA, CuDNN

1. Import dependencies

Check GPU and CPU device availability.

from tensorflow.python.client import device_lib
[x.physical_device_desc for x in device_lib.list_local_devices() if x.device_type == 'GPU']

[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
incarnation: 12791665773499620215

Basic import, if anything needed during the development process, dump here.

import math, os, sys
import pickle
from glob import glob
import numpy as np
from numpy.random import random, permutation, randn, normal
from matplotlib import pyplot as plt
import plotly as py
import plotly.plotly as ploty
from plotly.offline import download_plotlyjs, init_notebook_mode, plot,iplot
import plotly.graph_objs as go
%matplotlib inline
import PIL
from PIL import Image
import bcolz
from shutil import copyfile
from shutil import move
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.preprocessing import Imputer, LabelEncoder, OneHotEncoder, StandardScaler
import itertools
import random
from extract_bottleneck_features import *
import cv2 

import keras
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential, Model
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers import Input,  GlobalAveragePooling2D, AveragePooling2D, GlobalAveragePooling1D, GlobalMaxPooling2D
from keras.layers.convolutional import Conv2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD, RMSprop, Adam
from keras.preprocessing import image
from keras.layers.normalization import BatchNormalization
from keras.utils.np_utils import to_categorical
from keras.metrics import categorical_crossentropy
from keras.regularizers import l2,l1

Create some handy functions

def get_batches(dirname, gen=image.ImageDataGenerator(), shuffle=True, 
                batch_size=1, target_size=(224,224), class_mode='categorical'):
    return gen.flow_from_directory(path+dirname, target_size, 
                class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)

def plots(ims, figsize=(12,6), rows=1, interp=False, titles=None):
    if type(ims[0]) is np.ndarray:
        ims = np.array(ims).astype(np.uint8)
        if (ims.shape[-1] != 3):
            ims = ims.transpose((0,2,3,1))
    f = plt.figure(figsize=figsize)
    for i in range(len(ims)):
        sp = f.add_subplot(rows, len(ims)//rows, i+1)
        if titles is not None:
            sp.set_title(titles[i], fontsize=16)
        plt.imshow(ims[i], interpolation=None if interp else 'none')

def get_classes(path):
    batches = get_batches('train', shuffle=False, batch_size=1)
    val_batches = get_batches('valid', shuffle=False, batch_size=1)
    #test_batches = get_batches('test', shuffle=False, batch_size=1)
    return (val_batches.classes, 
            # test_batches.filenames)

def get_data(path, target_size = (224,224), batch_size=batch_size):
    batches = get_batches(path, shuffle=False, batch_size=1, class_mode=None, target_size=target_size)
    return np.concatenate([ for i in range (len(batches.classes))])

def save_array(fname, arr):
    c=bcolz.carray(arr, rootdir=fname, mode='w')
def load_array(fname):

def plot_confusion_matrix(cm, classes,
                          title='Confusion matrix',
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
        print('Confusion matrix, without normalization')

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 color="white" if cm[i, j] > thresh else "black")

    plt.ylabel('True label')
    plt.xlabel('Predicted label')
def plot_history(Exp_history):
    plt.legend(['train', 'validation'], loc='upper left')
    plt.legend(['train', 'validation'], loc='upper left')
    #plt.savefig(results_path+'/train_history/Exp_train_history_1.png', bbox_inches='tight')

def plot_acc(Exp_history):
    plt.legend(['train', 'validation'], loc='upper left')
    #plt.savefig(results_path+'/train_history/Exp_train_history_1.png', bbox_inches='tight')

1. Data preparation and exploration

As provided, the train dataset contains the following categories of driving states:

  • c0: safe driving
  • c1: texting - right
  • c2: talking on the phone - right
  • c3: texting - left
  • c4: talking on the phone - left
  • c5: operating the radio
  • c6: drinking
  • c7: reaching behind
  • c8: hair and makeup
  • c9: talking to passenger

1.1 Prepare data directories

As general, we will distribute the training data into train and validation sets.

#%cd project


current_dir = os.getcwd()
PROJECT_DIR = current_dir
path = current_dir+'/imgs/'
test_path = path + '/test/' #We use all the test data
train_path = path + '/train/'
result_path = path + '/results/'
valid_path = path + '/valid/'

WARNING: These lines are only to run once.

'''%cd $path
%mkdir valid
%mkdir results
%mkdir models'''

'%cd $path\n%mkdir valid\n%mkdir results\n%mkdir models'

'''# Creating validation set
%cd $valid_path
%mkdir c0
%mkdir c1
%mkdir c2
%mkdir c3
%mkdir c4
%mkdir c5
%mkdir c6
%mkdir c7
%mkdir c8
%mkdir c9
%cd $path
%cd $path'''

'# Creating validation set\n%cd $valid_path\n%mkdir c0\n%mkdir c1\n%mkdir c2\n%mkdir c3\n%mkdir c4\n%mkdir c5\n%mkdir c6\n%mkdir c7\n%mkdir c8\n%mkdir c9\n%cd $path\n%cd $path'

%cd $path


class_labels = ['c0','c1','c2','c3','c4','c5','c6','c7','c8','c9']

for i in class_labels:
    print ('label {0} has {1:5d} images'.format(i,len([name for name in os.listdir(train_path+i) 
                                                         if os.path.isfile(os.path.join(train_path+i, name))])))

label c0 has  1989 images
label c1 has  1767 images
label c2 has  1817 images
label c3 has  1846 images
label c4 has  1826 images
label c5 has  1812 images
label c6 has  1825 images
label c7 has  1502 images
label c8 has  1152 images
label c9 has  1238 images

summ = float(0)
for i in class_labels:
    summ = summ +len([name for name in os.listdir(train_path+i) if os.path.isfile(os.path.join(train_path+i, name))])


There are around 2000 images for each categories. It is probably a good idea to move 25% of images (500 images for each categories) to validation sets.

%cd $train_path


'''# moving ~20% data from train sets to validation sets
for label in class_labels:
    g = glob(label+'/*.jpg')
    shuffle = np.random.permutation(g)
    for i in range(500): move(shuffle[i], valid_path+shuffle[i])'''

"# moving ~20% data from train sets to validation sets\nfor label in class_labels:\n    g = glob(label+'/*.jpg')\n    shuffle = np.random.permutation(g)\n    for i in range(500): move(shuffle[i], valid_path+shuffle[i])"

summ = float(0)
for i in class_labels:
    summ = summ +len([name for name in os.listdir(valid_path+i) if os.path.isfile(os.path.join(valid_path+i, name))])


summ = float(0)
summ = summ +len([name for name in os.listdir(test_path) if os.path.isfile(os.path.join(test_path, name))])


End Warning

1.2 Visualization

batches = get_batches('train', batch_size=6)
imgs,labels = next(batches)
# Plot randomly 6 images
plots(imgs, titles=labels, figsize=(20,10), rows =2)

Found 16774 images belonging to 10 classes.

batches = get_batches('valid', batch_size=6)
imgs,labels = next(batches)

# Plot randomly 6 images
plots(imgs, titles=labels, figsize=(20,10), rows =2)

Found 5000 images belonging to 10 classes.

1.3 Batches preparation

Batches as direct inputs

WARNING: These lines are only to run once.

# Roll into pixcel matrix
train_data = get_data('train')
valid_data = get_data('valid')

Found 16774 images belonging to 10 classes.
Found 5000 images belonging to 10 classes.

%cd $path


(valid_classes, train_classes, valid_labels, train_labels, valid_filenames, train_filenames) = get_classes(path)

Found 16774 images belonging to 10 classes.
Found 5000 images belonging to 10 classes.

save_array('results/train_data.dat', train_data)
save_array('results/valid_data.dat', valid_data)


Loading labels and stored data.

test_data = get_data('test')

Found 6 images belonging to 1 classes.

%cd $path


1.3 Summary

2. Experiements

We will start with the simpliest model: a fully connected network with no hidden layer, i.e., linear model. This is to provide a benchmark for subsequence development.


  • We used batchnormalization right at the input layer to avoid any domination input values that could skew the output.
  • We activated the output with a softmax layer for 10 classes.
  • We will 224x224 input shape, as the results we will have 1.5+ million parametters and easily overfitted with a linear model, hence, l2 regularization is used to minimize impact of overfitting.

2.1 Linear model

Linear_model = Sequential([
        BatchNormalization(axis=-1, input_shape=(224,224,3)),
        Dense(10, activation='softmax')

Linear_model.compile(Adam(lr=10e-5), loss='categorical_crossentropy', metrics=['accuracy'])

Layer (type)                 Output Shape              Param #   
batch_normalization_2 (Batch (None, 224, 224, 3)       12        
flatten_2 (Flatten)          (None, 150528)            0         
dense_2 (Dense)              (None, 10)                1505290   
Total params: 1,505,302
Trainable params: 1,505,296
Non-trainable params: 6

if 'session' in locals() and session is not None:
    print('Close interactive session')

import tensorflow as tf

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=.2, allow_growth = 1)

Training 10 epochs.

Ex1_history =,train_labels, batch_size=batch_size, epochs=5,
                               validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/5
16774/16774 [==============================] - 175s - loss: 0.8099 - acc: 0.7950 - val_loss: 0.6078 - val_acc: 0.8584
Epoch 2/5
16774/16774 [==============================] - 175s - loss: 0.2166 - acc: 0.9360 - val_loss: 0.2855 - val_acc: 0.9360
Epoch 3/5
16774/16774 [==============================] - 170s - loss: 0.1774 - acc: 0.9512 - val_loss: 0.1343 - val_acc: 0.9650
Epoch 4/5
16774/16774 [==============================] - 175s - loss: 0.1297 - acc: 0.9683 - val_loss: 0.2260 - val_acc: 0.9468
Epoch 5/5
16774/16774 [==============================] - 169s - loss: 0.1140 - acc: 0.9697 - val_loss: 0.1852 - val_acc: 0.9618

# list all data in history

dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])

predictions = Linear_model.predict(train_data)
predictions = np.argmax(predictions, axis=-1)

array([0, 0, 0, ..., 9, 9, 9], dtype=int64)

train_labels2 = np.argmax(train_labels, axis=-1)

In [549]:
f1_score(train_labels2, predictions, average=None)

array([ 0.98,  0.98,  0.98,  0.99,  0.99,  0.99,  0.97,  1.  ,  0.98,  0.99])

valid_labels2 = np.argmax(valid_labels, axis=-1)

predictions_v = Linear_model.predict(valid_data)
predictions_v = np.argmax(predictions_v, axis=-1)

array([0, 0, 0, ..., 9, 9, 9], dtype=int64)

array([0, 0, 0, ..., 9, 9, 9], dtype=int64)

f1_score(valid_labels2, predictions_v, average=None)

array([ 0.96,  0.96,  0.97,  0.98,  0.98,  0.97,  0.94,  0.99,  0.94,  0.97])

(5000, 224, 224, 3)

In [416]:

(3, 224, 224, 3)

cnf_matrix = confusion_matrix(train_labels2, predictions)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[1905   27    3    1    4   26   21    0    1    1]
 [   0 1767    0    0    0    0    0    0    0    0]
 [   0   14 1777    1    0    1   23    1    0    0]
 [   0   21    3 1813    4    2    3    0    0    0]
 [   0    4    0    2 1792    3   24    1    0    0]
 [   1    0    0    0    0 1809    0    0    2    0]
 [   0    1    0    0    0    2 1822    0    0    0]
 [   0    1    1    0    0    1    3 1495    1    0]
 [   0    3    3    2    2    7   20    2 1111    2]
 [   0    2    6    1    0    3    7    2    3 1214]]
Normalized confusion matrix
[[  9.58e-01   1.36e-02   1.51e-03   5.03e-04   2.01e-03   1.31e-02
    1.06e-02   0.00e+00   5.03e-04   5.03e-04]
 [  0.00e+00   1.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   7.71e-03   9.78e-01   5.50e-04   0.00e+00   5.50e-04
    1.27e-02   5.50e-04   0.00e+00   0.00e+00]
 [  0.00e+00   1.14e-02   1.63e-03   9.82e-01   2.17e-03   1.08e-03
    1.63e-03   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   2.19e-03   0.00e+00   1.10e-03   9.81e-01   1.64e-03
    1.31e-02   5.48e-04   0.00e+00   0.00e+00]
 [  5.52e-04   0.00e+00   0.00e+00   0.00e+00   0.00e+00   9.98e-01
    0.00e+00   0.00e+00   1.10e-03   0.00e+00]
 [  0.00e+00   5.48e-04   0.00e+00   0.00e+00   0.00e+00   1.10e-03
    9.98e-01   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   6.66e-04   6.66e-04   0.00e+00   0.00e+00   6.66e-04
    2.00e-03   9.95e-01   6.66e-04   0.00e+00]
 [  0.00e+00   2.60e-03   2.60e-03   1.74e-03   1.74e-03   6.08e-03
    1.74e-02   1.74e-03   9.64e-01   1.74e-03]
 [  0.00e+00   1.62e-03   4.85e-03   8.08e-04   0.00e+00   2.42e-03
    5.65e-03   1.62e-03   2.42e-03   9.81e-01]]

acts = ["driving safely",
           "texting - right",
            "talking on the phone - right",
            "texting - left",
            "talking on the phone - left",
            "operating the radio",
            "reaching behind", 
            "doing hair and makeup",
            "talking to passenger"]
batches = get_data("test", batch_size=2)
batch = get_batches("test", batch_size=2)
for i in batches:
    predictions_test = Linear_model.predict(i.reshape(1,224, 224, 3)) 
    predictions_test = np.argmax(predictions_test, axis=-1)
    predictions_test = np.array(predictions_test, np.int32)
    print (predictions_test)
    print("dricer is {}".format(acts[predictions_test[0]]))

Found 6 images belonging to 1 classes.
Found 6 images belonging to 1 classes.
dricer is operating the radio
dricer is doing hair and makeup
dricer is texting - right
dricer is talking to passenger
dricer is driving safely
dricer is texting - left

def driver_activity_detector(img_path, batch_size=batch_size):
    acts = ["driving safely",
           "texting - right",
            "talking on the phone - right",
            "texting - left",
            "talking on the phone - left",
            "operating the radio",
            "reaching behind", 
            "doing hair and makeup",
            "talking to passenger"]
    activity = driver_activity(img_path)
    for i in batches:
        activity = Linear_model.predict(i.reshape(1,224, 224, 3)) 
        activity = np.argmax(activity, axis=-1)
        print("dricer is {}".format(acts[activity]))
    # Display the image
    batches = get_batches(img_path, batch_size=batch_size)
    imgs,j = next(batches)
    plots(imgs, titles=j, figsize=(20,10), rows =2)

%cd ..


In [129]:

It can be seen that the linear model with batchnormalization and some l2 regulation actually work pretty well. I can achive validation accuracy of around 94% on the validation dataset. However, the validation accuracy is not stable (even training accuracy is not stable), it means that model will not generalize very well but for starting model, this is very encouraging and it is clearly much better than random guess.

Next I will try to stablize the validation accuracy with convolutional networks.

2.2 Simple convolutional layer network

Next, I experiment a neural network with 2 convolutional layers. This experiment will give us an idea on how this dataset behave under convolutional actions. I will try to overfitting and later adding some regularization or data augmentation.

CNN_simple = Sequential([
            Conv2D(32,(3,3), activation='relu'),
            Conv2D(64,(3,3), activation='relu'),
            Dense(200, activation='relu'),
            Dense(10, activation='softmax')

CNN_simple.compile(Adam(lr=10e-5), loss='categorical_crossentropy', metrics=['accuracy'])

Layer (type)                 Output Shape              Param #   
batch_normalization_69 (Batc (None, 224, 224, 3)       12        
conv2d_35 (Conv2D)           (None, 222, 222, 32)      896       
batch_normalization_70 (Batc (None, 222, 222, 32)      128       
max_pooling2d_34 (MaxPooling (None, 74, 74, 32)        0         
conv2d_36 (Conv2D)           (None, 72, 72, 64)        18496     
batch_normalization_71 (Batc (None, 72, 72, 64)        256       
max_pooling2d_35 (MaxPooling (None, 24, 24, 64)        0         
flatten_19 (Flatten)         (None, 36864)             0         
dense_28 (Dense)             (None, 200)               7373000   
batch_normalization_72 (Batc (None, 200)               800       
dense_29 (Dense)             (None, 10)                2010      
Total params: 7,395,598
Trainable params: 7,395,000
Non-trainable params: 598

Ex2_history =,train_labels, batch_size=batch_size, epochs=5, 
                             validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/5
16774/16774 [==============================] - 2489s - loss: 0.2583 - acc: 0.9329 - val_loss: 0.1536 - val_acc: 0.9730
Epoch 2/5
16774/16774 [==============================] - 2408s - loss: 0.0248 - acc: 0.9964 - val_loss: 0.0615 - val_acc: 0.9850
Epoch 3/5
16774/16774 [==============================] - 2396s - loss: 0.0132 - acc: 0.9987 - val_loss: 0.0189 - val_acc: 0.9968
Epoch 4/5
16774/16774 [==============================] - 2341s - loss: 0.0051 - acc: 0.9996 - val_loss: 0.0126 - val_acc: 0.9964
Epoch 5/5
16774/16774 [==============================] - 2283s - loss: 0.0020 - acc: 1.0000 - val_loss: 0.0089 - val_acc: 0.9978

predictions_Ex2 = CNN_simple.predict(train_data)
predictions_Ex2 = np.argmax(predictions_Ex2, axis=-1)
train_labels_Ex2 = np.argmax(train_labels, axis=-1)
f1_score(train_labels_Ex2, predictions_Ex2, average=None)

array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

predictions_Ex2_v = CNN_simple.predict(valid_data)
predictions_Ex2_v = np.argmax(predictions_Ex2_v, axis=-1)
valid_labels_Ex2 = np.argmax(valid_labels, axis=-1)
f1_score(valid_labels_Ex2, predictions_Ex2_v, average=None)

array([ 1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  1.  ,  0.99,  0.99])

cnf_matrix = confusion_matrix(train_labels_Ex2, predictions_Ex2)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[1989    0    0    0    0    0    0    0    0    0]
 [   0 1767    0    0    0    0    0    0    0    0]
 [   0    0 1817    0    0    0    0    0    0    0]
 [   0    0    0 1846    0    0    0    0    0    0]
 [   0    0    0    0 1826    0    0    0    0    0]
 [   0    0    0    0    0 1812    0    0    0    0]
 [   0    0    0    0    0    0 1825    0    0    0]
 [   0    0    0    0    0    0    0 1502    0    0]
 [   0    0    0    0    0    0    0    0 1152    0]
 [   0    0    0    0    0    0    0    0    0 1238]]
Normalized confusion matrix
[[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]

cnf_matrix = confusion_matrix(valid_labels_Ex2, predictions_Ex2_v)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[500   0   0   0   0   0   0   0   0   0]
 [  0 500   0   0   0   0   0   0   0   0]
 [  0   0 500   0   0   0   0   0   0   0]
 [  0   0   0 500   0   0   0   0   0   0]
 [  0   0   0   0 500   0   0   0   0   0]
 [  1   0   0   0   0 497   1   0   1   0]
 [  0   0   0   0   0   0 500   0   0   0]
 [  0   0   0   0   0   0   0 500   0   0]
 [  0   0   0   0   0   0   1   0 495   4]
 [  0   0   0   0   0   0   0   0   3 497]]
Normalized confusion matrix
[[ 1.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    1.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    1.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    1.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    1.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.99  0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    1.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    1.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.99  0.01]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.01  0.99]]

With 4 epochs, I was able to get to 99% accuracy, however, It is slightly overfitting the model, I will add some reularization or specifically dropout to see if I can stablize the accuracy reading a bit more.

2.2 Simple convolutional layer network with dropout

Since I will add dropout, e.i., losing some information in order to simplify the model so that we can generalize better but losing information also cause the network less sophisticated, to have a "good" trade-off, I add one more convolutional layer.

CNN_dropout = Sequential([
        Conv2D(32,(3,3), activation='relu'),
        Conv2D(64,(3,3), activation='relu'),
        Conv2D(128,(3,3), activation='relu'),
        Conv2D(256,(3,3), activation='relu'),
        Dense(200, activation='relu'),
        Dense(200, activation='relu'),
        Dense(10, activation='softmax')
CNN_dropout.compile(Adam(lr=10e-5), loss='categorical_crossentropy', metrics=['accuracy'])

Layer (type)                 Output Shape              Param #   
batch_normalization_117 (Bat (None, 224, 224, 3)       12        
conv2d_60 (Conv2D)           (None, 222, 222, 32)      896       
batch_normalization_118 (Bat (None, 222, 222, 32)      128       
max_pooling2d_66 (MaxPooling (None, 111, 111, 32)      0         
conv2d_61 (Conv2D)           (None, 109, 109, 64)      18496     
batch_normalization_119 (Bat (None, 109, 109, 64)      436       
max_pooling2d_67 (MaxPooling (None, 54, 54, 64)        0         
conv2d_62 (Conv2D)           (None, 52, 52, 128)       73856     
batch_normalization_120 (Bat (None, 52, 52, 128)       512       
max_pooling2d_68 (MaxPooling (None, 26, 26, 128)       0         
conv2d_63 (Conv2D)           (None, 24, 24, 256)       295168    
batch_normalization_121 (Bat (None, 24, 24, 256)       1024      
max_pooling2d_69 (MaxPooling (None, 12, 12, 256)       0         
average_pooling2d_1 (Average (None, 6, 6, 256)         0         
flatten_27 (Flatten)         (None, 9216)              0         
dense_51 (Dense)             (None, 200)               1843400   
batch_normalization_122 (Bat (None, 200)               800       
dropout_15 (Dropout)         (None, 200)               0         
dense_52 (Dense)             (None, 200)               40200     
batch_normalization_123 (Bat (None, 200)               800       
dropout_16 (Dropout)         (None, 200)               0         
dense_53 (Dense)             (None, 10)                2010      
Total params: 2,277,738
Trainable params: 2,275,882
Non-trainable params: 1,856

Ex3_history =,train_labels, batch_size=batch_size, epochs=2, 
                              validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/2
16774/16774 [==============================] - 237s 14ms/step - loss: -105.7810 - acc: 0.6892 - val_loss: -33.3225 - val_acc: 0.9612
Epoch 2/2
16774/16774 [==============================] - 288s 17ms/step - loss: -185.9037 - acc: 0.9785 - val_loss: -47.0898 - val_acc: 0.9930

Ex3_history =,train_labels, batch_size=batch_size, epochs=5, 
                              validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/5
16774/16774 [==============================] - 4820s - loss: 1.6404 - acc: 0.5048 - val_loss: 0.5544 - val_acc: 0.8476
Epoch 2/5
16774/16774 [==============================] - 4894s - loss: 0.5402 - acc: 0.8281 - val_loss: 0.1174 - val_acc: 0.9742
Epoch 3/5
16774/16774 [==============================] - 5652s - loss: 0.2553 - acc: 0.9250 - val_loss: 0.0699 - val_acc: 0.9836
Epoch 4/5
16774/16774 [==============================] - 5774s - loss: 0.1653 - acc: 0.9534 - val_loss: 0.0375 - val_acc: 0.9900
Epoch 5/5
16774/16774 [==============================] - 5587s - loss: 0.1119 - acc: 0.9683 - val_loss: 0.0555 - val_acc: 0.9842

predictions_Ex3 = CNN_dropout.predict(train_data)
predictions_Ex3 = np.argmax(predictions_Ex3, axis=-1)
train_labels_Ex3 = np.argmax(train_labels, axis=-1)
f1_score(train_labels_Ex3, predictions_Ex3, average=None)

array([ 0.99,  1.  ,  1.  ,  1.  ,  0.99,  0.99,  0.99,  1.  ,  0.98,  0.99])

predictions_Ex3_v = CNN_dropout.predict(valid_data)
predictions_Ex3_v = np.argmax(predictions_Ex3_v, axis=-1)
valid_labels_Ex3 = np.argmax(valid_labels, axis=-1)
f1_score(valid_labels_Ex3, predictions_Ex3_v, average=None)

array([ 0.97,  1.  ,  0.99,  1.  ,  0.99,  0.99,  0.98,  0.99,  0.96,  0.97])

cnf_matrix = confusion_matrix(train_labels_Ex3, predictions_Ex3)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[1983    0    0    0    4    0    0    0    0    2]
 [   2 1763    0    0    0    0    0    0    0    2]
 [   0    1 1812    0    1    0    0    1    1    1]
 [   3    0    0 1841    2    0    0    0    0    0]
 [   0    0    0    3 1823    0    0    0    0    0]
 [   2    0    1    0    0 1809    0    0    0    0]
 [   0    4    2    0    9   17 1788    0    1    4]
 [   2    0    0    0    0    0    0 1500    0    0]
 [  13    0    3    0    4    1    0    4 1120    7]
 [   9    0    0    0    1    1    0    0    2 1225]]
Normalized confusion matrix
[[  9.97e-01   0.00e+00   0.00e+00   0.00e+00   2.01e-03   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   1.01e-03]
 [  1.13e-03   9.98e-01   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   1.13e-03]
 [  0.00e+00   5.50e-04   9.97e-01   0.00e+00   5.50e-04   0.00e+00
    0.00e+00   5.50e-04   5.50e-04   5.50e-04]
 [  1.63e-03   0.00e+00   0.00e+00   9.97e-01   1.08e-03   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   0.00e+00   0.00e+00   1.64e-03   9.98e-01   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  1.10e-03   0.00e+00   5.52e-04   0.00e+00   0.00e+00   9.98e-01
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   2.19e-03   1.10e-03   0.00e+00   4.93e-03   9.32e-03
    9.80e-01   0.00e+00   5.48e-04   2.19e-03]
 [  1.33e-03   0.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   9.99e-01   0.00e+00   0.00e+00]
 [  1.13e-02   0.00e+00   2.60e-03   0.00e+00   3.47e-03   8.68e-04
    0.00e+00   3.47e-03   9.72e-01   6.08e-03]
 [  7.27e-03   0.00e+00   0.00e+00   0.00e+00   8.08e-04   8.08e-04
    0.00e+00   0.00e+00   1.62e-03   9.89e-01]]

cnf_matrix = confusion_matrix(valid_labels_Ex3, predictions_Ex3_v)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[494   0   0   0   1   2   0   0   0   3]
 [  1 499   0   0   0   0   0   0   0   0]
 [  0   1 498   0   1   0   0   0   0   0]
 [  1   0   0 497   2   0   0   0   0   0]
 [  0   0   0   0 500   0   0   0   0   0]
 [  1   0   0   0   0 498   0   0   0   1]
 [  1   1   3   0   2   7 481   1   1   3]
 [  0   0   1   0   0   0   0 499   0   0]
 [  9   0   3   0   2   3   0   5 465  13]
 [  9   0   0   0   0   0   0   1   0 490]]
Normalized confusion matrix
[[ 0.99  0.    0.    0.    0.    0.    0.    0.    0.    0.01]
 [ 0.    1.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    1.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.99  0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    1.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    1.    0.    0.    0.    0.  ]
 [ 0.    0.    0.01  0.    0.    0.01  0.96  0.    0.    0.01]
 [ 0.    0.    0.    0.    0.    0.    0.    1.    0.    0.  ]
 [ 0.02  0.    0.01  0.    0.    0.01  0.    0.01  0.93  0.03]
 [ 0.02  0.    0.    0.    0.    0.    0.    0.    0.    0.98]]

2.4. Using vgg16 model without pre-trained weights

  • Preprocessing input data specifically for vgg network
  • Put common blocks into functions
  • Add batchnormalixation layers (since vgg16 was invented before the introduction of batchnorn , original vgg16 does not contain batchnorm layers)

vgg16 model construction

Batches as outputs from vgg16 pretained weights

I will leverage the pre-trained imageNet weights available for vgg16 models.

To do this, it is easier to precompute outputs of vgg16's convolutional part and use these outputs as our inputs to our models. This process will help to save time as well as memory.

from keras.applications.vgg16 import VGG16
vgg_model_orig = VGG16(weights='imagenet', include_top=True)

In [55]:
# Pop all the layers in Vgg model
layers = vgg_model_orig.layers
# Find all the index of Convolution2D layers
Con2D_layer_idx = [index for index,layer in enumerate(layers) if type(layer) is Conv2D]
# Call the index of the last Convolution2D layer
layer_idx = Con2D_layer_idx[-1]

[1, 2, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17]

TensorShape([Dimension(None), Dimension(1000)])

# Create convolutional model up to layer 30
conv_model = Sequential(layers[:layer_idx+1])

Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

x = conv_model.output
x = MaxPooling2D()(x)
x = AveragePooling2D()(x)
x = MaxPooling2D()(x)

x = Flatten()(x)

x = Dense(4096, activation='relu')(x)
x = BatchNormalization()(x)

x = Dropout(0.5)(x)

x = Dense(4096, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)

output = Dense(10, activation='softmax')(x) # output 10 classes
vgg_model = Model(inputs = conv_model.input, outputs = output)

# Set all the convolutional layers to nontrainable
for layer in conv_model.layers:
    layer.trainable = False

Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
max_pooling2d_89 (MaxPooling (None, 7, 7, 512)         0         
average_pooling2d_15 (Averag (None, 3, 3, 512)         0         
max_pooling2d_90 (MaxPooling (None, 1, 1, 512)         0         
flatten_38 (Flatten)         (None, 512)               0         
dense_74 (Dense)             (None, 4096)              2101248   
batch_normalization_138 (Bat (None, 4096)              16384     
dropout_29 (Dropout)         (None, 4096)              0         
dense_75 (Dense)             (None, 4096)              16781312  
batch_normalization_139 (Bat (None, 4096)              16384     
dropout_30 (Dropout)         (None, 4096)              0         
dense_76 (Dense)             (None, 10)                40970     
Total params: 33,670,986
Trainable params: 18,939,914
Non-trainable params: 14,731,072

vgg_model.compile(Adam(lr=10e-5), loss='categorical_crossentropy', metrics=['accuracy'])

batch_size = 1
Ex41_history =,train_labels, batch_size=batch_size, epochs=3, 
                              validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/3
16774/16774 [==============================] - 14618s - loss: 2.2926 - acc: 0.1146 - val_loss: 14.4584 - val_acc: 0.1028
Epoch 2/3
16774/16774 [==============================] - 15513s - loss: 2.2930 - acc: 0.1118 - val_loss: 14.4540 - val_acc: 0.1026
Epoch 3/3
16774/16774 [==============================] - 14967s - loss: 2.2929 - acc: 0.1108 - val_loss: 14.3356 - val_acc: 0.1098

In [119]:

In [120]:

predictions_Ex41 = vgg_model.predict(train_data)
predictions_Ex41 = np.argmax(predictions_Ex41, axis=-1)
train_labels_Ex41 = np.argmax(train_labels, axis=-1)
f1_score(train_labels_Ex41, predictions_Ex41, average=None)

array([ 0.2 ,  0.14,  0.  ,  0.12,  0.03,  0.  ,  0.03,  0.12,  0.05,  0.15])

predictions_Ex41_v = vgg_model.predict(valid_data)
predictions_Ex41_v = np.argmax(predictions_Ex41_v, axis=-1)
valid_labels_Ex41 = np.argmax(valid_labels, axis=-1)
f1_score(valid_labels_Ex41, predictions_Ex41_v, average=None)

array([ 0.16,  0.13,  0.01,  0.1 ,  0.03,  0.  ,  0.04,  0.13,  0.07,  0.17])

cnf_matrix = confusion_matrix(train_labels_Ex41, predictions_Ex41)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[1983    0    0    0    4    0    0    0    0    2]
 [   2 1763    0    0    0    0    0    0    0    2]
 [   0    1 1812    0    1    0    0    1    1    1]
 [   3    0    0 1841    2    0    0    0    0    0]
 [   0    0    0    3 1823    0    0    0    0    0]
 [   2    0    1    0    0 1809    0    0    0    0]
 [   0    4    2    0    9   17 1788    0    1    4]
 [   2    0    0    0    0    0    0 1500    0    0]
 [  13    0    3    0    4    1    0    4 1120    7]
 [   9    0    0    0    1    1    0    0    2 1225]]
Normalized confusion matrix
[[  9.97e-01   0.00e+00   0.00e+00   0.00e+00   2.01e-03   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   1.01e-03]
 [  1.13e-03   9.98e-01   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   1.13e-03]
 [  0.00e+00   5.50e-04   9.97e-01   0.00e+00   5.50e-04   0.00e+00
    0.00e+00   5.50e-04   5.50e-04   5.50e-04]
 [  1.63e-03   0.00e+00   0.00e+00   9.97e-01   1.08e-03   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   0.00e+00   0.00e+00   1.64e-03   9.98e-01   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  1.10e-03   0.00e+00   5.52e-04   0.00e+00   0.00e+00   9.98e-01
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   2.19e-03   1.10e-03   0.00e+00   4.93e-03   9.32e-03
    9.80e-01   0.00e+00   5.48e-04   2.19e-03]
 [  1.33e-03   0.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   9.99e-01   0.00e+00   0.00e+00]
 [  1.13e-02   0.00e+00   2.60e-03   0.00e+00   3.47e-03   8.68e-04
    0.00e+00   3.47e-03   9.72e-01   6.08e-03]
 [  7.27e-03   0.00e+00   0.00e+00   0.00e+00   8.08e-04   8.08e-04
    0.00e+00   0.00e+00   1.62e-03   9.89e-01]]

cnf_matrix = confusion_matrix(valid_labels_Ex41, predictions_Ex41_v)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[494   0   0   0   1   2   0   0   0   3]
 [  1 499   0   0   0   0   0   0   0   0]
 [  0   1 498   0   1   0   0   0   0   0]
 [  1   0   0 497   2   0   0   0   0   0]
 [  0   0   0   0 500   0   0   0   0   0]
 [  1   0   0   0   0 498   0   0   0   1]
 [  1   1   3   0   2   7 481   1   1   3]
 [  0   0   1   0   0   0   0 499   0   0]
 [  9   0   3   0   2   3   0   5 465  13]
 [  9   0   0   0   0   0   0   1   0 490]]
Normalized confusion matrix
[[ 0.99  0.    0.    0.    0.    0.    0.    0.    0.    0.01]
 [ 0.    1.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    1.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.99  0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    1.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    1.    0.    0.    0.    0.  ]
 [ 0.    0.    0.01  0.    0.    0.01  0.96  0.    0.    0.01]
 [ 0.    0.    0.    0.    0.    0.    0.    1.    0.    0.  ]
 [ 0.02  0.    0.01  0.    0.    0.01  0.    0.01  0.93  0.03]
 [ 0.02  0.    0.    0.    0.    0.    0.    0.    0.    0.98]]

2.5 Using InceptionV3 model with transfered learning (pretrained weights)

In this experiment, we will use InceptionV3 model with pre-trained imageNet weights. We will first start with the base model with slightly modification of the top layer to adapt with 10 classes instead 1000 classes.

Base model

from keras.applications.inception_v3 import InceptionV3
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
output = Dense(10, activation='softmax')(x) # output 10 classes
IcepV3_model = Model(inputs=base_model.input, outputs=output)

for layer in base_model.layers:
    layer.trainable = False
IcepV3_model.compile(Adam(lr=10e-5), loss='categorical_crossentropy', metrics=['accuracy'])

Downloading data from
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
Layer (type)                     Output Shape          Param #     Connected to                     
input_4 (InputLayer)             (None, None, None, 3) 0                                            
conv2d_252 (Conv2D)              (None, None, None, 32 864         input_4[0][0]                    
batch_normalization_328 (BatchNo (None, None, None, 32 96          conv2d_252[0][0]                 
activation_189 (Activation)      (None, None, None, 32 0           batch_normalization_328[0][0]    
conv2d_253 (Conv2D)              (None, None, None, 32 9216        activation_189[0][0]             
batch_normalization_329 (BatchNo (None, None, None, 32 96          conv2d_253[0][0]                 
activation_190 (Activation)      (None, None, None, 32 0           batch_normalization_329[0][0]    
conv2d_254 (Conv2D)              (None, None, None, 64 18432       activation_190[0][0]             
batch_normalization_330 (BatchNo (None, None, None, 64 192         conv2d_254[0][0]                 
activation_191 (Activation)      (None, None, None, 64 0           batch_normalization_330[0][0]    
max_pooling2d_99 (MaxPooling2D)  (None, None, None, 64 0           activation_191[0][0]             
conv2d_255 (Conv2D)              (None, None, None, 80 5120        max_pooling2d_99[0][0]           
batch_normalization_331 (BatchNo (None, None, None, 80 240         conv2d_255[0][0]                 
activation_192 (Activation)      (None, None, None, 80 0           batch_normalization_331[0][0]    
conv2d_256 (Conv2D)              (None, None, None, 19 138240      activation_192[0][0]             
batch_normalization_332 (BatchNo (None, None, None, 19 576         conv2d_256[0][0]                 
activation_193 (Activation)      (None, None, None, 19 0           batch_normalization_332[0][0]    
max_pooling2d_100 (MaxPooling2D) (None, None, None, 19 0           activation_193[0][0]             
conv2d_260 (Conv2D)              (None, None, None, 64 12288       max_pooling2d_100[0][0]          
batch_normalization_336 (BatchNo (None, None, None, 64 192         conv2d_260[0][0]                 
activation_197 (Activation)      (None, None, None, 64 0           batch_normalization_336[0][0]    
conv2d_258 (Conv2D)              (None, None, None, 48 9216        max_pooling2d_100[0][0]          
conv2d_261 (Conv2D)              (None, None, None, 96 55296       activation_197[0][0]             
batch_normalization_334 (BatchNo (None, None, None, 48 144         conv2d_258[0][0]                 
batch_normalization_337 (BatchNo (None, None, None, 96 288         conv2d_261[0][0]                 
activation_195 (Activation)      (None, None, None, 48 0           batch_normalization_334[0][0]    
activation_198 (Activation)      (None, None, None, 96 0           batch_normalization_337[0][0]    
average_pooling2d_34 (AveragePoo (None, None, None, 19 0           max_pooling2d_100[0][0]          
conv2d_257 (Conv2D)              (None, None, None, 64 12288       max_pooling2d_100[0][0]          
conv2d_259 (Conv2D)              (None, None, None, 64 76800       activation_195[0][0]             
conv2d_262 (Conv2D)              (None, None, None, 96 82944       activation_198[0][0]             
conv2d_263 (Conv2D)              (None, None, None, 32 6144        average_pooling2d_34[0][0]       
batch_normalization_333 (BatchNo (None, None, None, 64 192         conv2d_257[0][0]                 
batch_normalization_335 (BatchNo (None, None, None, 64 192         conv2d_259[0][0]                 
batch_normalization_338 (BatchNo (None, None, None, 96 288         conv2d_262[0][0]                 
batch_normalization_339 (BatchNo (None, None, None, 32 96          conv2d_263[0][0]                 
activation_194 (Activation)      (None, None, None, 64 0           batch_normalization_333[0][0]    
activation_196 (Activation)      (None, None, None, 64 0           batch_normalization_335[0][0]    
activation_199 (Activation)      (None, None, None, 96 0           batch_normalization_338[0][0]    
activation_200 (Activation)      (None, None, None, 32 0           batch_normalization_339[0][0]    
mixed0 (Concatenate)             (None, None, None, 25 0           activation_194[0][0]             
conv2d_267 (Conv2D)              (None, None, None, 64 16384       mixed0[0][0]                     
batch_normalization_343 (BatchNo (None, None, None, 64 192         conv2d_267[0][0]                 
activation_204 (Activation)      (None, None, None, 64 0           batch_normalization_343[0][0]    
conv2d_265 (Conv2D)              (None, None, None, 48 12288       mixed0[0][0]                     
conv2d_268 (Conv2D)              (None, None, None, 96 55296       activation_204[0][0]             
batch_normalization_341 (BatchNo (None, None, None, 48 144         conv2d_265[0][0]                 
batch_normalization_344 (BatchNo (None, None, None, 96 288         conv2d_268[0][0]                 
activation_202 (Activation)      (None, None, None, 48 0           batch_normalization_341[0][0]    
activation_205 (Activation)      (None, None, None, 96 0           batch_normalization_344[0][0]    
average_pooling2d_35 (AveragePoo (None, None, None, 25 0           mixed0[0][0]                     
conv2d_264 (Conv2D)              (None, None, None, 64 16384       mixed0[0][0]                     
conv2d_266 (Conv2D)              (None, None, None, 64 76800       activation_202[0][0]             
conv2d_269 (Conv2D)              (None, None, None, 96 82944       activation_205[0][0]             
conv2d_270 (Conv2D)              (None, None, None, 64 16384       average_pooling2d_35[0][0]       
batch_normalization_340 (BatchNo (None, None, None, 64 192         conv2d_264[0][0]                 
batch_normalization_342 (BatchNo (None, None, None, 64 192         conv2d_266[0][0]                 
batch_normalization_345 (BatchNo (None, None, None, 96 288         conv2d_269[0][0]                 
batch_normalization_346 (BatchNo (None, None, None, 64 192         conv2d_270[0][0]                 
activation_201 (Activation)      (None, None, None, 64 0           batch_normalization_340[0][0]    
activation_203 (Activation)      (None, None, None, 64 0           batch_normalization_342[0][0]    
activation_206 (Activation)      (None, None, None, 96 0           batch_normalization_345[0][0]    
activation_207 (Activation)      (None, None, None, 64 0           batch_normalization_346[0][0]    
mixed1 (Concatenate)             (None, None, None, 28 0           activation_201[0][0]             
conv2d_274 (Conv2D)              (None, None, None, 64 18432       mixed1[0][0]                     
batch_normalization_350 (BatchNo (None, None, None, 64 192         conv2d_274[0][0]                 
activation_211 (Activation)      (None, None, None, 64 0           batch_normalization_350[0][0]    
conv2d_272 (Conv2D)              (None, None, None, 48 13824       mixed1[0][0]                     
conv2d_275 (Conv2D)              (None, None, None, 96 55296       activation_211[0][0]             
batch_normalization_348 (BatchNo (None, None, None, 48 144         conv2d_272[0][0]                 
batch_normalization_351 (BatchNo (None, None, None, 96 288         conv2d_275[0][0]                 
activation_209 (Activation)      (None, None, None, 48 0           batch_normalization_348[0][0]    
activation_212 (Activation)      (None, None, None, 96 0           batch_normalization_351[0][0]    
average_pooling2d_36 (AveragePoo (None, None, None, 28 0           mixed1[0][0]                     
conv2d_271 (Conv2D)              (None, None, None, 64 18432       mixed1[0][0]                     
conv2d_273 (Conv2D)              (None, None, None, 64 76800       activation_209[0][0]             
conv2d_276 (Conv2D)              (None, None, None, 96 82944       activation_212[0][0]             
conv2d_277 (Conv2D)              (None, None, None, 64 18432       average_pooling2d_36[0][0]       
batch_normalization_347 (BatchNo (None, None, None, 64 192         conv2d_271[0][0]                 
batch_normalization_349 (BatchNo (None, None, None, 64 192         conv2d_273[0][0]                 
batch_normalization_352 (BatchNo (None, None, None, 96 288         conv2d_276[0][0]                 
batch_normalization_353 (BatchNo (None, None, None, 64 192         conv2d_277[0][0]                 
activation_208 (Activation)      (None, None, None, 64 0           batch_normalization_347[0][0]    
activation_210 (Activation)      (None, None, None, 64 0           batch_normalization_349[0][0]    
activation_213 (Activation)      (None, None, None, 96 0           batch_normalization_352[0][0]    
activation_214 (Activation)      (None, None, None, 64 0           batch_normalization_353[0][0]    
mixed2 (Concatenate)             (None, None, None, 28 0           activation_208[0][0]             
conv2d_279 (Conv2D)              (None, None, None, 64 18432       mixed2[0][0]                     
batch_normalization_355 (BatchNo (None, None, None, 64 192         conv2d_279[0][0]                 
activation_216 (Activation)      (None, None, None, 64 0           batch_normalization_355[0][0]    
conv2d_280 (Conv2D)              (None, None, None, 96 55296       activation_216[0][0]             
batch_normalization_356 (BatchNo (None, None, None, 96 288         conv2d_280[0][0]                 
activation_217 (Activation)      (None, None, None, 96 0           batch_normalization_356[0][0]    
conv2d_278 (Conv2D)              (None, None, None, 38 995328      mixed2[0][0]                     
conv2d_281 (Conv2D)              (None, None, None, 96 82944       activation_217[0][0]             
batch_normalization_354 (BatchNo (None, None, None, 38 1152        conv2d_278[0][0]                 
batch_normalization_357 (BatchNo (None, None, None, 96 288         conv2d_281[0][0]                 
activation_215 (Activation)      (None, None, None, 38 0           batch_normalization_354[0][0]    
activation_218 (Activation)      (None, None, None, 96 0           batch_normalization_357[0][0]    
max_pooling2d_101 (MaxPooling2D) (None, None, None, 28 0           mixed2[0][0]                     
mixed3 (Concatenate)             (None, None, None, 76 0           activation_215[0][0]             
conv2d_286 (Conv2D)              (None, None, None, 12 98304       mixed3[0][0]                     
batch_normalization_362 (BatchNo (None, None, None, 12 384         conv2d_286[0][0]                 
activation_223 (Activation)      (None, None, None, 12 0           batch_normalization_362[0][0]    
conv2d_287 (Conv2D)              (None, None, None, 12 114688      activation_223[0][0]             
batch_normalization_363 (BatchNo (None, None, None, 12 384         conv2d_287[0][0]                 
activation_224 (Activation)      (None, None, None, 12 0           batch_normalization_363[0][0]    
conv2d_283 (Conv2D)              (None, None, None, 12 98304       mixed3[0][0]                     
conv2d_288 (Conv2D)              (None, None, None, 12 114688      activation_224[0][0]             
batch_normalization_359 (BatchNo (None, None, None, 12 384         conv2d_283[0][0]                 
batch_normalization_364 (BatchNo (None, None, None, 12 384         conv2d_288[0][0]                 
activation_220 (Activation)      (None, None, None, 12 0           batch_normalization_359[0][0]    
activation_225 (Activation)      (None, None, None, 12 0           batch_normalization_364[0][0]    
conv2d_284 (Conv2D)              (None, None, None, 12 114688      activation_220[0][0]             
conv2d_289 (Conv2D)              (None, None, None, 12 114688      activation_225[0][0]             
batch_normalization_360 (BatchNo (None, None, None, 12 384         conv2d_284[0][0]                 
batch_normalization_365 (BatchNo (None, None, None, 12 384         conv2d_289[0][0]                 
activation_221 (Activation)      (None, None, None, 12 0           batch_normalization_360[0][0]    
activation_226 (Activation)      (None, None, None, 12 0           batch_normalization_365[0][0]    
average_pooling2d_37 (AveragePoo (None, None, None, 76 0           mixed3[0][0]                     
conv2d_282 (Conv2D)              (None, None, None, 19 147456      mixed3[0][0]                     
conv2d_285 (Conv2D)              (None, None, None, 19 172032      activation_221[0][0]             
conv2d_290 (Conv2D)              (None, None, None, 19 172032      activation_226[0][0]             
conv2d_291 (Conv2D)              (None, None, None, 19 147456      average_pooling2d_37[0][0]       
batch_normalization_358 (BatchNo (None, None, None, 19 576         conv2d_282[0][0]                 
batch_normalization_361 (BatchNo (None, None, None, 19 576         conv2d_285[0][0]                 
batch_normalization_366 (BatchNo (None, None, None, 19 576         conv2d_290[0][0]                 
batch_normalization_367 (BatchNo (None, None, None, 19 576         conv2d_291[0][0]                 
activation_219 (Activation)      (None, None, None, 19 0           batch_normalization_358[0][0]    
activation_222 (Activation)      (None, None, None, 19 0           batch_normalization_361[0][0]    
activation_227 (Activation)      (None, None, None, 19 0           batch_normalization_366[0][0]    
activation_228 (Activation)      (None, None, None, 19 0           batch_normalization_367[0][0]    
mixed4 (Concatenate)             (None, None, None, 76 0           activation_219[0][0]             
conv2d_296 (Conv2D)              (None, None, None, 16 122880      mixed4[0][0]                     
batch_normalization_372 (BatchNo (None, None, None, 16 480         conv2d_296[0][0]                 
activation_233 (Activation)      (None, None, None, 16 0           batch_normalization_372[0][0]    
conv2d_297 (Conv2D)              (None, None, None, 16 179200      activation_233[0][0]             
batch_normalization_373 (BatchNo (None, None, None, 16 480         conv2d_297[0][0]                 
activation_234 (Activation)      (None, None, None, 16 0           batch_normalization_373[0][0]    
conv2d_293 (Conv2D)              (None, None, None, 16 122880      mixed4[0][0]                     
conv2d_298 (Conv2D)              (None, None, None, 16 179200      activation_234[0][0]             
batch_normalization_369 (BatchNo (None, None, None, 16 480         conv2d_293[0][0]                 
batch_normalization_374 (BatchNo (None, None, None, 16 480         conv2d_298[0][0]                 
activation_230 (Activation)      (None, None, None, 16 0           batch_normalization_369[0][0]    
activation_235 (Activation)      (None, None, None, 16 0           batch_normalization_374[0][0]    
conv2d_294 (Conv2D)              (None, None, None, 16 179200      activation_230[0][0]             
conv2d_299 (Conv2D)              (None, None, None, 16 179200      activation_235[0][0]             
batch_normalization_370 (BatchNo (None, None, None, 16 480         conv2d_294[0][0]                 
batch_normalization_375 (BatchNo (None, None, None, 16 480         conv2d_299[0][0]                 
activation_231 (Activation)      (None, None, None, 16 0           batch_normalization_370[0][0]    
activation_236 (Activation)      (None, None, None, 16 0           batch_normalization_375[0][0]    
average_pooling2d_38 (AveragePoo (None, None, None, 76 0           mixed4[0][0]                     
conv2d_292 (Conv2D)              (None, None, None, 19 147456      mixed4[0][0]                     
conv2d_295 (Conv2D)              (None, None, None, 19 215040      activation_231[0][0]             
conv2d_300 (Conv2D)              (None, None, None, 19 215040      activation_236[0][0]             
conv2d_301 (Conv2D)              (None, None, None, 19 147456      average_pooling2d_38[0][0]       
batch_normalization_368 (BatchNo (None, None, None, 19 576         conv2d_292[0][0]                 
batch_normalization_371 (BatchNo (None, None, None, 19 576         conv2d_295[0][0]                 
batch_normalization_376 (BatchNo (None, None, None, 19 576         conv2d_300[0][0]                 
batch_normalization_377 (BatchNo (None, None, None, 19 576         conv2d_301[0][0]                 
activation_229 (Activation)      (None, None, None, 19 0           batch_normalization_368[0][0]    
activation_232 (Activation)      (None, None, None, 19 0           batch_normalization_371[0][0]    
activation_237 (Activation)      (None, None, None, 19 0           batch_normalization_376[0][0]    
activation_238 (Activation)      (None, None, None, 19 0           batch_normalization_377[0][0]    
mixed5 (Concatenate)             (None, None, None, 76 0           activation_229[0][0]             
conv2d_306 (Conv2D)              (None, None, None, 16 122880      mixed5[0][0]                     
batch_normalization_382 (BatchNo (None, None, None, 16 480         conv2d_306[0][0]                 
activation_243 (Activation)      (None, None, None, 16 0           batch_normalization_382[0][0]    
conv2d_307 (Conv2D)              (None, None, None, 16 179200      activation_243[0][0]             
batch_normalization_383 (BatchNo (None, None, None, 16 480         conv2d_307[0][0]                 
activation_244 (Activation)      (None, None, None, 16 0           batch_normalization_383[0][0]    
conv2d_303 (Conv2D)              (None, None, None, 16 122880      mixed5[0][0]                     
conv2d_308 (Conv2D)              (None, None, None, 16 179200      activation_244[0][0]             
batch_normalization_379 (BatchNo (None, None, None, 16 480         conv2d_303[0][0]                 
batch_normalization_384 (BatchNo (None, None, None, 16 480         conv2d_308[0][0]                 
activation_240 (Activation)      (None, None, None, 16 0           batch_normalization_379[0][0]    
activation_245 (Activation)      (None, None, None, 16 0           batch_normalization_384[0][0]    
conv2d_304 (Conv2D)              (None, None, None, 16 179200      activation_240[0][0]             
conv2d_309 (Conv2D)              (None, None, None, 16 179200      activation_245[0][0]             
batch_normalization_380 (BatchNo (None, None, None, 16 480         conv2d_304[0][0]                 
batch_normalization_385 (BatchNo (None, None, None, 16 480         conv2d_309[0][0]                 
activation_241 (Activation)      (None, None, None, 16 0           batch_normalization_380[0][0]    
activation_246 (Activation)      (None, None, None, 16 0           batch_normalization_385[0][0]    
average_pooling2d_39 (AveragePoo (None, None, None, 76 0           mixed5[0][0]                     
conv2d_302 (Conv2D)              (None, None, None, 19 147456      mixed5[0][0]                     
conv2d_305 (Conv2D)              (None, None, None, 19 215040      activation_241[0][0]             
conv2d_310 (Conv2D)              (None, None, None, 19 215040      activation_246[0][0]             
conv2d_311 (Conv2D)              (None, None, None, 19 147456      average_pooling2d_39[0][0]       
batch_normalization_378 (BatchNo (None, None, None, 19 576         conv2d_302[0][0]                 
batch_normalization_381 (BatchNo (None, None, None, 19 576         conv2d_305[0][0]                 
batch_normalization_386 (BatchNo (None, None, None, 19 576         conv2d_310[0][0]                 
batch_normalization_387 (BatchNo (None, None, None, 19 576         conv2d_311[0][0]                 
activation_239 (Activation)      (None, None, None, 19 0           batch_normalization_378[0][0]    
activation_242 (Activation)      (None, None, None, 19 0           batch_normalization_381[0][0]    
activation_247 (Activation)      (None, None, None, 19 0           batch_normalization_386[0][0]    
activation_248 (Activation)      (None, None, None, 19 0           batch_normalization_387[0][0]    
mixed6 (Concatenate)             (None, None, None, 76 0           activation_239[0][0]             
conv2d_316 (Conv2D)              (None, None, None, 19 147456      mixed6[0][0]                     
batch_normalization_392 (BatchNo (None, None, None, 19 576         conv2d_316[0][0]                 
activation_253 (Activation)      (None, None, None, 19 0           batch_normalization_392[0][0]    
conv2d_317 (Conv2D)              (None, None, None, 19 258048      activation_253[0][0]             
batch_normalization_393 (BatchNo (None, None, None, 19 576         conv2d_317[0][0]                 
activation_254 (Activation)      (None, None, None, 19 0           batch_normalization_393[0][0]    
conv2d_313 (Conv2D)              (None, None, None, 19 147456      mixed6[0][0]                     
conv2d_318 (Conv2D)              (None, None, None, 19 258048      activation_254[0][0]             
batch_normalization_389 (BatchNo (None, None, None, 19 576         conv2d_313[0][0]                 
batch_normalization_394 (BatchNo (None, None, None, 19 576         conv2d_318[0][0]                 
activation_250 (Activation)      (None, None, None, 19 0           batch_normalization_389[0][0]    
activation_255 (Activation)      (None, None, None, 19 0           batch_normalization_394[0][0]    
conv2d_314 (Conv2D)              (None, None, None, 19 258048      activation_250[0][0]             
conv2d_319 (Conv2D)              (None, None, None, 19 258048      activation_255[0][0]             
batch_normalization_390 (BatchNo (None, None, None, 19 576         conv2d_314[0][0]                 
batch_normalization_395 (BatchNo (None, None, None, 19 576         conv2d_319[0][0]                 
activation_251 (Activation)      (None, None, None, 19 0           batch_normalization_390[0][0]    
activation_256 (Activation)      (None, None, None, 19 0           batch_normalization_395[0][0]    
average_pooling2d_40 (AveragePoo (None, None, None, 76 0           mixed6[0][0]                     
conv2d_312 (Conv2D)              (None, None, None, 19 147456      mixed6[0][0]                     
conv2d_315 (Conv2D)              (None, None, None, 19 258048      activation_251[0][0]             
conv2d_320 (Conv2D)              (None, None, None, 19 258048      activation_256[0][0]             
conv2d_321 (Conv2D)              (None, None, None, 19 147456      average_pooling2d_40[0][0]       
batch_normalization_388 (BatchNo (None, None, None, 19 576         conv2d_312[0][0]                 
batch_normalization_391 (BatchNo (None, None, None, 19 576         conv2d_315[0][0]                 
batch_normalization_396 (BatchNo (None, None, None, 19 576         conv2d_320[0][0]                 
batch_normalization_397 (BatchNo (None, None, None, 19 576         conv2d_321[0][0]                 
activation_249 (Activation)      (None, None, None, 19 0           batch_normalization_388[0][0]    
activation_252 (Activation)      (None, None, None, 19 0           batch_normalization_391[0][0]    
activation_257 (Activation)      (None, None, None, 19 0           batch_normalization_396[0][0]    
activation_258 (Activation)      (None, None, None, 19 0           batch_normalization_397[0][0]    
mixed7 (Concatenate)             (None, None, None, 76 0           activation_249[0][0]             
conv2d_324 (Conv2D)              (None, None, None, 19 147456      mixed7[0][0]                     
batch_normalization_400 (BatchNo (None, None, None, 19 576         conv2d_324[0][0]                 
activation_261 (Activation)      (None, None, None, 19 0           batch_normalization_400[0][0]    
conv2d_325 (Conv2D)              (None, None, None, 19 258048      activation_261[0][0]             
batch_normalization_401 (BatchNo (None, None, None, 19 576         conv2d_325[0][0]                 
activation_262 (Activation)      (None, None, None, 19 0           batch_normalization_401[0][0]    
conv2d_322 (Conv2D)              (None, None, None, 19 147456      mixed7[0][0]                     
conv2d_326 (Conv2D)              (None, None, None, 19 258048      activation_262[0][0]             
batch_normalization_398 (BatchNo (None, None, None, 19 576         conv2d_322[0][0]                 
batch_normalization_402 (BatchNo (None, None, None, 19 576         conv2d_326[0][0]                 
activation_259 (Activation)      (None, None, None, 19 0           batch_normalization_398[0][0]    
activation_263 (Activation)      (None, None, None, 19 0           batch_normalization_402[0][0]    
conv2d_323 (Conv2D)              (None, None, None, 32 552960      activation_259[0][0]             
conv2d_327 (Conv2D)              (None, None, None, 19 331776      activation_263[0][0]             
batch_normalization_399 (BatchNo (None, None, None, 32 960         conv2d_323[0][0]                 
batch_normalization_403 (BatchNo (None, None, None, 19 576         conv2d_327[0][0]                 
activation_260 (Activation)      (None, None, None, 32 0           batch_normalization_399[0][0]    
activation_264 (Activation)      (None, None, None, 19 0           batch_normalization_403[0][0]    
max_pooling2d_102 (MaxPooling2D) (None, None, None, 76 0           mixed7[0][0]                     
mixed8 (Concatenate)             (None, None, None, 12 0           activation_260[0][0]             
conv2d_332 (Conv2D)              (None, None, None, 44 573440      mixed8[0][0]                     
batch_normalization_408 (BatchNo (None, None, None, 44 1344        conv2d_332[0][0]                 
activation_269 (Activation)      (None, None, None, 44 0           batch_normalization_408[0][0]    
conv2d_329 (Conv2D)              (None, None, None, 38 491520      mixed8[0][0]                     
conv2d_333 (Conv2D)              (None, None, None, 38 1548288     activation_269[0][0]             
batch_normalization_405 (BatchNo (None, None, None, 38 1152        conv2d_329[0][0]                 
batch_normalization_409 (BatchNo (None, None, None, 38 1152        conv2d_333[0][0]                 
activation_266 (Activation)      (None, None, None, 38 0           batch_normalization_405[0][0]    
activation_270 (Activation)      (None, None, None, 38 0           batch_normalization_409[0][0]    
conv2d_330 (Conv2D)              (None, None, None, 38 442368      activation_266[0][0]             
conv2d_331 (Conv2D)              (None, None, None, 38 442368      activation_266[0][0]             
conv2d_334 (Conv2D)              (None, None, None, 38 442368      activation_270[0][0]             
conv2d_335 (Conv2D)              (None, None, None, 38 442368      activation_270[0][0]             
average_pooling2d_41 (AveragePoo (None, None, None, 12 0           mixed8[0][0]                     
conv2d_328 (Conv2D)              (None, None, None, 32 409600      mixed8[0][0]                     
batch_normalization_406 (BatchNo (None, None, None, 38 1152        conv2d_330[0][0]                 
batch_normalization_407 (BatchNo (None, None, None, 38 1152        conv2d_331[0][0]                 
batch_normalization_410 (BatchNo (None, None, None, 38 1152        conv2d_334[0][0]                 
batch_normalization_411 (BatchNo (None, None, None, 38 1152        conv2d_335[0][0]                 
conv2d_336 (Conv2D)              (None, None, None, 19 245760      average_pooling2d_41[0][0]       
batch_normalization_404 (BatchNo (None, None, None, 32 960         conv2d_328[0][0]                 
activation_267 (Activation)      (None, None, None, 38 0           batch_normalization_406[0][0]    
activation_268 (Activation)      (None, None, None, 38 0           batch_normalization_407[0][0]    
activation_271 (Activation)      (None, None, None, 38 0           batch_normalization_410[0][0]    
activation_272 (Activation)      (None, None, None, 38 0           batch_normalization_411[0][0]    
batch_normalization_412 (BatchNo (None, None, None, 19 576         conv2d_336[0][0]                 
activation_265 (Activation)      (None, None, None, 32 0           batch_normalization_404[0][0]    
mixed9_0 (Concatenate)           (None, None, None, 76 0           activation_267[0][0]             
concatenate_5 (Concatenate)      (None, None, None, 76 0           activation_271[0][0]             
activation_273 (Activation)      (None, None, None, 19 0           batch_normalization_412[0][0]    
mixed9 (Concatenate)             (None, None, None, 20 0           activation_265[0][0]             
conv2d_341 (Conv2D)              (None, None, None, 44 917504      mixed9[0][0]                     
batch_normalization_417 (BatchNo (None, None, None, 44 1344        conv2d_341[0][0]                 
activation_278 (Activation)      (None, None, None, 44 0           batch_normalization_417[0][0]    
conv2d_338 (Conv2D)              (None, None, None, 38 786432      mixed9[0][0]                     
conv2d_342 (Conv2D)              (None, None, None, 38 1548288     activation_278[0][0]             
batch_normalization_414 (BatchNo (None, None, None, 38 1152        conv2d_338[0][0]                 
batch_normalization_418 (BatchNo (None, None, None, 38 1152        conv2d_342[0][0]                 
activation_275 (Activation)      (None, None, None, 38 0           batch_normalization_414[0][0]    
activation_279 (Activation)      (None, None, None, 38 0           batch_normalization_418[0][0]    
conv2d_339 (Conv2D)              (None, None, None, 38 442368      activation_275[0][0]             
conv2d_340 (Conv2D)              (None, None, None, 38 442368      activation_275[0][0]             
conv2d_343 (Conv2D)              (None, None, None, 38 442368      activation_279[0][0]             
conv2d_344 (Conv2D)              (None, None, None, 38 442368      activation_279[0][0]             
average_pooling2d_42 (AveragePoo (None, None, None, 20 0           mixed9[0][0]                     
conv2d_337 (Conv2D)              (None, None, None, 32 655360      mixed9[0][0]                     
batch_normalization_415 (BatchNo (None, None, None, 38 1152        conv2d_339[0][0]                 
batch_normalization_416 (BatchNo (None, None, None, 38 1152        conv2d_340[0][0]                 
batch_normalization_419 (BatchNo (None, None, None, 38 1152        conv2d_343[0][0]                 
batch_normalization_420 (BatchNo (None, None, None, 38 1152        conv2d_344[0][0]                 
conv2d_345 (Conv2D)              (None, None, None, 19 393216      average_pooling2d_42[0][0]       
batch_normalization_413 (BatchNo (None, None, None, 32 960         conv2d_337[0][0]                 
activation_276 (Activation)      (None, None, None, 38 0           batch_normalization_415[0][0]    
activation_277 (Activation)      (None, None, None, 38 0           batch_normalization_416[0][0]    
activation_280 (Activation)      (None, None, None, 38 0           batch_normalization_419[0][0]    
activation_281 (Activation)      (None, None, None, 38 0           batch_normalization_420[0][0]    
batch_normalization_421 (BatchNo (None, None, None, 19 576         conv2d_345[0][0]                 
activation_274 (Activation)      (None, None, None, 32 0           batch_normalization_413[0][0]    
mixed9_1 (Concatenate)           (None, None, None, 76 0           activation_276[0][0]             
concatenate_6 (Concatenate)      (None, None, None, 76 0           activation_280[0][0]             
activation_282 (Activation)      (None, None, None, 19 0           batch_normalization_421[0][0]    
mixed10 (Concatenate)            (None, None, None, 20 0           activation_274[0][0]             
global_average_pooling2d_11 (Glo (None, 2048)          0           mixed10[0][0]                    
dense_77 (Dense)                 (None, 1024)          2098176     global_average_pooling2d_11[0][0]
dense_78 (Dense)                 (None, 10)            10250       dense_77[0][0]                   
Total params: 23,911,210
Trainable params: 2,108,426
Non-trainable params: 21,802,784

Ex4_history =,train_labels, batch_size=batch_size, epochs=3, 
                              validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/3
16774/16774 [==============================] - 3654s - loss: 1.4326 - acc: 0.5639 - val_loss: 0.8979 - val_acc: 0.7318
Epoch 2/3
16774/16774 [==============================] - 3654s - loss: 0.7781 - acc: 0.7829 - val_loss: 0.5889 - val_acc: 0.8428
Epoch 3/3
16774/16774 [==============================] - 3639s - loss: 0.5853 - acc: 0.8370 - val_loss: 0.4632 - val_acc: 0.8762

Ex4_history =,train_labels, batch_size=batch_size, epochs=5, 
                              validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/5
16774/16774 [==============================] - 3613s - loss: 0.4872 - acc: 0.8605 - val_loss: 0.4367 - val_acc: 0.8754
Epoch 2/5
16774/16774 [==============================] - 3595s - loss: 0.4147 - acc: 0.8812 - val_loss: 0.3730 - val_acc: 0.8940
Epoch 3/5
16774/16774 [==============================] - 3686s - loss: 0.3754 - acc: 0.8894 - val_loss: 0.3546 - val_acc: 0.8990
Epoch 4/5
16774/16774 [==============================] - 3621s - loss: 0.3242 - acc: 0.9085 - val_loss: 0.3383 - val_acc: 0.8996
Epoch 5/5
16774/16774 [==============================] - 4035s - loss: 0.2987 - acc: 0.9158 - val_loss: 0.2790 - val_acc: 0.9236

predictions_Ex4 = IcepV3_model.predict(train_data)
predictions_Ex4 = np.argmax(predictions_Ex4, axis=-1)
train_labels_Ex4 = np.argmax(train_labels, axis=-1)
f1_score(train_labels_Ex4, predictions_Ex4, average=None)

array([ 1.  ,  0.99,  0.99,  1.  ,  1.  ,  1.  ,  1.  ,  0.99,  1.  ,  1.  ])

predictions_Ex4_v = IcepV3_model.predict(valid_data)
predictions_Ex4_v = np.argmax(predictions_Ex4_v, axis=-1)
valid_labels_Ex4 = np.argmax(valid_labels, axis=-1)
f1_score(valid_labels_Ex4, predictions_Ex4_v, average=None)

array([ 0.99,  0.99,  0.98,  1.  ,  1.  ,  1.  ,  0.99,  0.98,  0.99,  0.99])

cnf_matrix = confusion_matrix(train_labels_Ex4, predictions_Ex4)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[1986    0    0    0    0    1    1    0    1    0]
 [   0 1762    0    0    0    0    0    1    0    4]
 [   0   30 1771    0    0    0    0   15    0    1]
 [   0    0    0 1846    0    0    0    0    0    0]
 [   0    0    0    0 1826    0    0    0    0    0]
 [   0    0    0    0    0 1811    0    1    0    0]
 [   0    0    0    0    0    0 1825    0    0    0]
 [   5    0    0    0    0    0    0 1495    1    1]
 [   0    0    0    0    1    0    0    3 1148    0]
 [   0    0    0    0    0    1    0    0    0 1237]]
Normalized confusion matrix
[[  9.98e-01   0.00e+00   0.00e+00   0.00e+00   0.00e+00   5.03e-04
    5.03e-04   0.00e+00   5.03e-04   0.00e+00]
 [  0.00e+00   9.97e-01   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   5.66e-04   0.00e+00   2.26e-03]
 [  0.00e+00   1.65e-02   9.75e-01   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   8.26e-03   0.00e+00   5.50e-04]
 [  0.00e+00   0.00e+00   0.00e+00   1.00e+00   0.00e+00   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   0.00e+00   0.00e+00   0.00e+00   1.00e+00   0.00e+00
    0.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  0.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00   9.99e-01
    0.00e+00   5.52e-04   0.00e+00   0.00e+00]
 [  0.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    1.00e+00   0.00e+00   0.00e+00   0.00e+00]
 [  3.33e-03   0.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00
    0.00e+00   9.95e-01   6.66e-04   6.66e-04]
 [  0.00e+00   0.00e+00   0.00e+00   0.00e+00   8.68e-04   0.00e+00
    0.00e+00   2.60e-03   9.97e-01   0.00e+00]
 [  0.00e+00   0.00e+00   0.00e+00   0.00e+00   0.00e+00   8.08e-04
    0.00e+00   0.00e+00   0.00e+00   9.99e-01]]

cnf_matrix = confusion_matrix(valid_labels_Ex4, predictions_Ex4_v)

# Plot non-normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names,
                      title='Confusion matrix, without normalization')
# Plot normalized confusion matrix
plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                      title='Normalized confusion matrix')

Confusion matrix, without normalization
[[495   0   0   2   0   0   1   2   0   0]
 [  0 499   0   0   0   0   0   1   0   0]
 [  0   6 486   0   1   0   3   4   0   0]
 [  0   0   0 500   0   0   0   0   0   0]
 [  0   0   0   0 500   0   0   0   0   0]
 [  0   0   0   1   0 498   0   1   0   0]
 [  0   0   0   0   0   0 500   0   0   0]
 [  5   0   0   0   0   0   0 495   0   0]
 [  1   0   1   0   0   1   2   3 488   4]
 [  0   0   0   2   0   0   0   0   2 496]]
Normalized confusion matrix
[[ 0.99  0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    1.    0.    0.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.01  0.97  0.    0.    0.    0.01  0.01  0.    0.  ]
 [ 0.    0.    0.    1.    0.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    1.    0.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    1.    0.    0.    0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    1.    0.    0.    0.  ]
 [ 0.01  0.    0.    0.    0.    0.    0.    0.99  0.    0.  ]
 [ 0.    0.    0.    0.    0.    0.    0.    0.01  0.98  0.01]
 [ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.99]]

We are already overfitting the model, validation accuaracy stays around 0.91 and it seems like Inception3 net without modification further layers will not preduce a good results.

Further finetuning

Next, we will further fine-tune InceptionV3 models. We will freeze the first 172 layers and train the rest

# the first 172 layers and unfreeze the rest:
for layer in IcepV3_model.layers[:172]:
    layer.trainable = False
for layer in IcepV3_model.layers[172:]:
    layer.trainable = True

IcepV3_model.compile(Adam(lr=10e-5), loss='categorical_crossentropy', metrics=['accuracy'])

Ex5_history =,train_labels, batch_size=batch_size, epochs=5, 
                              validation_data =(valid_data,valid_labels))

Train on 16774 samples, validate on 5000 samples
Epoch 1/5
16774/16774 [==============================] - 5627s - loss: 0.1324 - acc: 0.9589 - val_loss: 0.0700 - val_acc: 0.9808
Epoch 2/5
16774/16774 [==============================] - 5203s - loss: 0.0212 - acc: 0.9937 - val_loss: 0.0243 - val_acc: 0.9918
Epoch 3/5
16774/16774 [==============================] - 4885s - loss: 0.0209 - acc: 0.9937 - val_loss: 0.0252 - val_acc: 0.9940
Epoch 4/5
16774/16774 [==============================] - 4807s - loss: 0.0318 - acc: 0.9914 - val_loss: 0.0245 - val_acc: 0.9934
Epoch 5/5
16774/16774 [==============================] - 4819s - loss: 0.0093 - acc: 0.9973 - val_loss: 0.0324 - val_acc: 0.9914

predictions_Ex4 = IcepV3_model.predict(train_data)
predictions_Ex4 = np.argmax(predictions_Ex4, axis=-1)
train_labels_Ex4 = np.argmax(train_labels, axis=-1)
f1_score(train_labels_Ex4, predictions_Ex4, average=None)

predictions_Ex4_v = IcepV3_model.predict(train_data)
predictions_Ex4_v = np.argmax(predictions_Ex4_v, axis=-1)
valid_labels_Ex4 = np.argmax(train_labels, axis=-1)
f1_score(valid_labels_Ex4, predictions_Ex4_v, average=None)

import plotly as py
import plotly.graph_objs as go
feature_type = ["CNN - benchmark",
                "CNN -  with Conv2D",
                "CNN - with Dropout",
               "InceptionV3 - finetuning"]
acc_scores = [0.9618, 0.9978, 0.9842,0.1098,0.9236,0.9914]
loss_score = [0.1852,0.0089, 0.0555, 14.3356, 0.2790, 0.0324]

# Create a trace
trace = go.Bar(
    x = feature_type,
    y = loss_score,
    text = loss_score,
    textposition = 'auto',
    line=dict(color='rgb(8,48,107)', width=1.5),),
    #mode = 'lines+markers',
    #name = 'lines+markers'

layout = go.Layout(
    title="Model's loss",
        title='CNN Models'

data = [trace]
fig = go.Figure(data=data, layout=layout)
py.offline.iplot(fig, filename="bar_chart")

import plotly as py
import plotly.graph_objs as go
feature_type = ["CNN - benchmark",
                "CNN -  with Conv2D",
                "CNN - with Dropout",
               "InceptionV3 - finetuning"]
acc_scores = [0.9618, 0.9978, 0.9842,0.1098,0.9236,0.9914]
loss_score = [0.1852,0.0089, 0.0555, 14.3356, 0.2790, 0.0324]

# Create a trace
trace = go.Bar(
    x = feature_type,
    y = acc_scores,
    text = acc_scores,
    textposition = 'auto',
    line=dict(color='rgb(8,48,107)', width=1.5),),
    #mode = 'lines+markers',
    #name = 'lines+markers'

layout = go.Layout(
    title="Model's accuracy performance",
        title='CNN Models'

data = [trace]
fig = go.Figure(data=data, layout=layout)
py.offline.iplot(fig, filename="bar_chart")

def analyze_content(path):
    Stats about the content of the images
    df = pd.read_csv(path)
    columns = ['c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9']
    subjects = df.drop_duplicates('subject')['subject'].tolist()
    new_df = pd.DataFrame(index=subjects, columns=columns)
    classes = df.drop_duplicates('classname')['classname'].tolist()

    print("Number of drivers: %d" % len(subjects))

    for subject in subjects:
        print("Analyzing subject %s" % subject)
        subject_data = df.loc[df['subject'] == subject]
        row = []
        for class_type in classes:
            images_data = subject_data.loc[df['classname'] == class_type]
        new_df.loc[subject] = row

    print(new_df.describe().loc[['mean', 'std', 'min', 'max']])

  File "<ipython-input-140-9781f4f097b0>", line 16
    row = [%pwd]
SyntaxError: invalid syntax

Number of drivers: 26
Analyzing subject p002
Analyzing subject p012
Analyzing subject p014
Analyzing subject p015
Analyzing subject p016
Analyzing subject p021
Analyzing subject p022
Analyzing subject p024
Analyzing subject p026
Analyzing subject p035
Analyzing subject p039
Analyzing subject p041
Analyzing subject p042
Analyzing subject p045
Analyzing subject p047
Analyzing subject p049
Analyzing subject p050
Analyzing subject p051
Analyzing subject p052
Analyzing subject p056
Analyzing subject p061
Analyzing subject p064
Analyzing subject p066
Analyzing subject p072
Analyzing subject p075
Analyzing subject p081
       c0   c1   c2   c3   c4   c5   c6   c7   c8   c9
p002   76   74   86   79   84   76   83   72   44   51
p012   84   95   91   89   97   96   75   72   62   62
p014  100  103  100  100  103  102  101   77   38   52
p015   79   85   88   94  101  101   99   81   86   61
p016  111  102  101  128  104  104  108  101   99  120
p021  135  131  127  128  132  130  126   98   99  131
p022  129  129  128  129  130  130  131   98   98  131
p024  130  129  128  130  129  131  129  101   99  120
p026  130  129  130  131  126  130  128   97   97   98
p035   94   81   88   89   89   89   94   87   56   81
p039   65   63   70   65   62   64   63   64   70   65
p041   60   64   60   60   60   61   61   61   59   59
p042   59   59   60   59   58   59   59   59   59   60
p045   75   75   76   75   75   76   71   67   66   68
p047   80   91   81   86   82   87   81   82   82   83
p049   84   85  119  110  109  116  119   74   79  116
p050  123   45   52   98   83   91   82   81   65   70
p051  182   81   81   83   81   83   95   80   62   92
p052   72   71   84   75   72   72   77   71   71   75
p056   81   80   80   78   82   81   80   74   83   75
p061   84   81   81   83   79   81   80   79   81   80
p064   83   81   83   84   86   85   82   79   81   76
p066  129  100  106  101  102  101  105   86  114   90
p072   63   62   36   31   34    6   35    2   21   56
p075   81   81   85   79   89   79   82   82   79   77
p081  100   90   96   82   77   81   79   77   61   80
              c0          c1          c2          c3          c4          c5  \
mean   95.730769   87.192308   89.115385   90.230769   89.461538   88.923077   
std    29.747683   22.847353   24.020536   24.826289   23.839850   27.161625   
min    59.000000   45.000000   36.000000   31.000000   34.000000    6.000000   
max   182.000000  131.000000  130.000000  131.000000  132.000000  131.000000   

              c6          c7          c8          c9  
mean   89.423077   77.000000   73.500000   81.884615  
std    24.100080   19.279004   21.369605   24.045502  
min    35.000000    2.000000   21.000000   51.000000  
max   131.000000  101.000000  114.000000  131.000000  

